Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash on continue after a missing package non-critical error #5942

Merged

Conversation

M4rtinK
Copy link
Contributor

@M4rtinK M4rtinK commented Oct 16, 2024

Due to the way how exceptions propagate from the Anaconda installation tasks we ended up with a non-critical error exception interrupting the iteration of an important task iteration loop installing the payload.

Due to not being handled "deep enough" but instead "bubbling up" to a top level error handler the loop apparently gets interrupted & remaining tasks skipped.

This resulted in an unrelated crash ("kernel version list no available") due to all the important installation tasks being skipped, packages not being installed and installation related data not being populated.

The end result was that a non-critical error (such as a missing package) would trigger a dialog asking the user to quit or continue - but clicking "continue" would result in a weird crash.

So move the error handler check closer to the task execution, to prevent the loop from being interrupted. That way the loop will resume its iteration if "continue" is clicked in the UI.

Also convert the non-critical error if it gets raised (user deciding not to continue after the non-critical error) to a fatal one. This is necessary, as otherwise the top level error handler would get triggered, asking the user again to quit or continue.

NOTE: Longer term we really should clean this up & have all installation tasks gathered, ordered and executed from a single place. Then all the error handling could be in a single place, making things much simpler.

(cherry picked from commit 73b412f)

Resolves: RHEL-57699
Resolves: INSTALLER-4045
Related: rhbz#2238045

Backport of #5937 to the RHEL 10 branch.

Due to the way how exceptions propagate from the Anaconda installation tasks we ended
up with a non-critical error exception interrupting the iteration of an important task
iteration loop installing the payload.

Due to not being handled "deep enough" but instead "bubbling up" to a top level
error handler the loop apparently gets interrupted & remaining tasks
skipped.

This resulted in an unrelated crash ("kernel version list no available")
due to all the important installation tasks being skipped, packages not
being installed and installation related data not being populated.

The end result was that a non-critical error (such as a missing package)
would trigger a dialog asking the user to quit or continue - but clicking
"continue" would result in a weird crash.

So move the error handler check closer to the task execution, to
prevent the loop from being interrupted. That way the loop will resume
its iteration if "continue" is clicked in the UI.

Also convert the non-critical error if it gets raised (user deciding
*not* to continue after the non-critical error) to a fatal one. This is
necessary, as otherwise the top level error handler would get triggered,
asking the user again to quit or continue.

NOTE: Longer term we really should clean this up & have all installation
tasks gathered, ordered and executed from a single place. Then all the
error handling could be in a single place, making things much simpler.

(cherry picked from commit 73b412f)

Resolves: RHEL-57699
Resolves: INSTALLER-4045
Related: rhbz#2238045
@M4rtinK
Copy link
Contributor Author

M4rtinK commented Oct 16, 2024

/kickstart-test --testtype smoke

1 similar comment
@M4rtinK
Copy link
Contributor Author

M4rtinK commented Oct 17, 2024

/kickstart-test --testtype smoke

@M4rtinK
Copy link
Contributor Author

M4rtinK commented Oct 17, 2024

/kickstart-test --skip-testtypes whatever

@M4rtinK
Copy link
Contributor Author

M4rtinK commented Oct 18, 2024

Wow, all 120 test passed with 5 retries - seems to be working, thanks @rvykydal! :)

I've also checked the results c for retried tests both sets of results are there, so if we wanted to look for race conditions, we still can do that even with retries on. )

Copy link
Member

@jkonecny12 jkonecny12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

# This results in a nice error dialog with "Exit Installer" button
# being shown.
raise PayloadInstallationError(str(e)) from e
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I dont get why we need to check fro ERROR_RAISE, handle broad Exception and raise again.
If we dont handle it it will be just propagate up. What am I missing here?

I mean convert this to::

try:
    sync_run_task(task_proxy)
except NonCriticalInstallationError as e:
    # Handle the non-critical error by raising it as a fatal PayloadInstallationError
    raise PayloadInstallationError(str(e)) from e

@M4rtinK M4rtinK added the ready to merge The PR can be merged. It should have all BZ flags required for releasing set (usually release+). label Oct 22, 2024
@M4rtinK M4rtinK merged commit bb02888 into rhinstaller:rhel-10 Oct 23, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge The PR can be merged. It should have all BZ flags required for releasing set (usually release+). rhel-10
Development

Successfully merging this pull request may close these issues.

3 participants